Job Radar. Live notifications. AI processed.
freelancer.com 2026-04-23 🟠
🔹 Refactor raw job descriptions into a structured Technical Manifest (JSON)
👤 Client: 🇮🇳 Kolkata, India Member since 2026-04-22
💰 Price: $14 / hr Average bid
🚩 Problem: Automate data extraction from a news site to generate structured content for analysis.
📦 Existing: Not specified
Specifications:
[Target] - News articles with headline, full text, author name, publication date, and image URLs
[Method] - Python script using Requests/BeautifulSoup or Scrapy for web scraping
[UI/UX] - N/A
[Stack] - Python (Requests/BeautifulSoup or Scrapy), CSV/JSON output, SQLite database
[Security] - Respect [robots.txt], implement rate-limiting
[Format] - Clean JSON/Csv with deduplication
Workflow:
Analyze the news site structure and identify key elements (headlines, article text, author name, date, images)
Choose a Python web scraping library (Requests/BeautifulSoup or Scrapy) based on complexity
Write initial script to navigate through latest articles section and extract required data
Implement pagination handling and rate-limiting logic
Test the script against [robots.txt] and ensure compliance with site policies
Output clean, de-duplicated data in JSON/Csv format and optionally store in SQLite database
Document setup instructions and include comments for future modifications